Towards Scalable Backpropagation-Free Gradient Estimation
Wang, Daniel, Markou, Evan, Campbell, Dylan
While backpropagation (reverse-mode automatic differentiation) has been extraordinarily successful in deep learning, it requires two passes through the neural network (forward and backward) and the storage of intermediate activations. Existing gradient estimation methods that instead use forward-mode automatic differentiation struggle to scale beyond small networks due to the high variance of their estimates. Efforts to mitigate this have so far introduced significant bias into the estimates, reducing their utility. We introduce a gradient estimation approach that reduces both bias and variance by manipulating upstream Jacobian matrices when computing guess directions. It shows promising results and has the potential to scale to larger networks, indeed performing better as the network width is increased. Our understanding of this method is aided by analyses of its bias and variance, and of their connection to the low-dimensional structure of neural network gradients.
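The baseline this abstract builds on, the forward-mode "guess direction" estimator, can be sketched in a few lines. The quadratic objective, dimensions, and sample counts below are illustrative assumptions, and `grad_f` stands in for a forward-mode Jacobian-vector product (in practice the directional derivative is computed in a single forward pass, without materializing the gradient):

```python
import numpy as np

# Forward-gradient sketch: sample a random guess direction v, compute the
# directional derivative d = (grad f) . v via forward-mode AD (no backward
# pass, no stored activations), and use d * v as the gradient estimate.
# Since E[v v^T] = I for v ~ N(0, I), the estimator is unbiased, but its
# variance grows with dimension -- the scaling problem discussed above.

def forward_gradient(grad_f, theta, rng):
    v = rng.standard_normal(theta.shape)   # random guess direction
    d = grad_f(theta) @ v                  # directional derivative (a JVP)
    return d * v                           # unbiased single-sample estimate

rng = np.random.default_rng(0)
theta = rng.standard_normal(50)
grad_f = lambda t: t                       # gradient of f(x) = ||x||^2 / 2

# Averaging many single-sample estimates recovers the true gradient.
est = np.mean([forward_gradient(grad_f, theta, rng) for _ in range(20000)],
              axis=0)
print(np.max(np.abs(est - theta)))         # small residual error
```

The per-sample variance of each component scales with the squared norm of the full gradient, which is why such estimators degrade as networks grow and why the bias/variance reductions proposed above matter.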
Growing with Experience: Growing Neural Networks in Deep Reinforcement Learning
Fehring, Lukas, Lindauer, Marius, Eimer, Theresa
While increasingly large models have revolutionized much of the machine learning landscape, training even mid-sized networks for Reinforcement Learning (RL) remains a struggle. This severely limits the complexity of the policies we are able to learn. To enable increased network capacity while maintaining trainability, we propose GrowNN, a simple yet effective method that grows the network progressively during training. We start by training a small network to learn an initial policy. Then we add layers without changing the encoded function. Subsequent updates can use the added layers to learn a more expressive policy, adding capacity as the policy's complexity increases. GrowNN can be seamlessly integrated into most existing RL agents. Our experiments on MiniHack and MuJoCo show improved agent performance, with incrementally GrowNN-deepened networks outperforming their static counterparts of the same final size by up to 48% on MiniHack Room and 72% on Ant.
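The key mechanism, adding layers "without changing the encoded function", can be sketched with an identity-initialized layer, as in function-preserving deepening schemes such as Net2DeeperNet. The layer representation and the ReLU choice below are illustrative assumptions, not the paper's exact architecture:

```python
import numpy as np

# Function-preserving growth sketch: insert a new hidden layer whose
# weights are the identity and whose biases are zero. Because
# ReLU(ReLU(h) @ I) == ReLU(h), the grown network computes exactly the
# same function as before, and later updates can use the new capacity.

def forward(layers, x):
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        if i < len(layers) - 1:            # ReLU on hidden layers only
            x = np.maximum(x, 0.0)
    return x

def grow_identity(layers, position):
    """Insert an identity-initialized layer after layer `position`."""
    width = layers[position][0].shape[1]
    new_layer = (np.eye(width), np.zeros(width))
    return layers[:position + 1] + [new_layer] + layers[position + 1:]

rng = np.random.default_rng(0)
layers = [(rng.standard_normal((4, 8)), np.zeros(8)),
          (rng.standard_normal((8, 2)), np.zeros(2))]
x = rng.standard_normal((5, 4))

before = forward(layers, x)
after = forward(grow_identity(layers, 0), x)
print(np.allclose(before, after))          # growth preserves the policy
```

Preserving the function at growth time is what keeps training stable: the agent's behavior does not jump when capacity is added.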
Review for NeurIPS paper: GradAug: A New Regularization Method for Deep Neural Networks
Summary and Contributions: After the rebuttal and discussion with the other reviewers, I have updated my score. However, I note several remaining concerns that the authors could address with further validation. It is good that the authors performed the time/memory comparison in the rebuttal, as that was a significant concern of mine. My remaining concerns mostly revolve around which other techniques this should be compared against. Given that the algorithm takes 3-4x as long as the baseline, one could, for example: (1) train a much larger network and then use compression techniques to slim it to the same size; or (2) use Mixup, which is still 70% faster.
Dissecting a Small Artificial Neural Network
Yang, Xiguang, Arora, Krish, Bachmann, Michael
We investigate the loss landscape and backpropagation dynamics of convergence for the simplest possible artificial neural network representing the logical exclusive-OR (XOR) gate. Cross-sections of the loss landscape in the nine-dimensional parameter space are found to exhibit distinct features, which help to explain why backpropagation efficiently achieves convergence toward zero loss while the values of the weights and biases keep drifting. Differences in the shapes of cross-sections obtained with nonrandomized and randomized batches are discussed. With reference to statistical physics, we introduce the microcanonical entropy as a unique quantity that allows us to characterize the phase behavior of the network. Learning in neural networks can thus be thought of as an annealing process that experiences the analogue of phase transitions known from thermodynamic systems. The analysis also reveals how the loss landscape simplifies as more hidden neurons are added to the network, eliminating entropic barriers caused by finite-size effects.
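The nine-dimensional parameter space mentioned above corresponds to a 2-2-1 network: a 2x2 hidden weight matrix, 2 hidden biases, 2 output weights, and 1 output bias. A minimal sketch of such a network trained by backpropagation on XOR, with illustrative (not the paper's) activation and learning-rate choices:

```python
import numpy as np

# 2-2-1 XOR network: 4 + 2 + 2 + 1 = 9 parameters, trained by plain
# gradient descent with hand-written backpropagation. tanh hidden units,
# sigmoid output, and mean-squared-error loss are illustrative choices.

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])               # XOR truth table

rng = np.random.default_rng(3)
W1, b1 = rng.standard_normal((2, 2)), np.zeros((1, 2))
W2, b2 = rng.standard_normal((2, 1)), np.zeros((1, 1))
n_params = W1.size + b1.size + W2.size + b2.size      # = 9

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
losses, lr = [], 0.5
for _ in range(5000):
    h = np.tanh(X @ W1 + b1)                          # forward pass
    p = sigmoid(h @ W2 + b2)
    losses.append(np.mean((p - y) ** 2))
    dp = 2 * (p - y) / len(X) * p * (1 - p)           # backward pass
    dh = (dp @ W2.T) * (1 - h ** 2)
    W2 -= lr * (h.T @ dp); b2 -= lr * dp.sum(0)
    W1 -= lr * (X.T @ dh); b1 -= lr * dh.sum(0)

print(n_params, losses[0], losses[-1])                # loss decreases
```

Tracking the nine parameters during such a run is what exposes the drift of weights and biases even after the loss has essentially converged.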
Adaptive Neural Networks Using Residual Fitting
Ford, Noah, Winder, John, McClellan, Josh
Current methods for estimating the required neural-network size for a given problem class, such as neural-architecture search and pruning, can be computationally intensive. In contrast, methods that add capacity to neural networks as needed may provide similar results to architecture search and pruning without requiring as much computation to find an appropriate network size. Here, we present a network-growth method that searches for explainable error in the network's residuals and grows the network if sufficient error is detected. We demonstrate this method on examples from classification, imitation learning, and reinforcement learning. Within these tasks, the growing network often achieves better performance than small networks that do not grow, and similar performance to networks that begin much larger.
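The residual-fitting idea can be sketched abstractly: fit a small model, then fit a second model to its residuals; if the residuals contain explainable structure (i.e., the second fit reduces error), add the residual predictor as new capacity. The polynomial-feature regressors below are a stand-in for small networks, and all modeling choices are illustrative assumptions:

```python
import numpy as np

# Residual fitting as a growth signal: the base model's residuals are
# checked for explainable error by fitting a second model to them. If the
# residual fit helps, the grown model is the sum of the two predictors.

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (200, 1))
y = np.sin(3 * X[:, 0]) + 0.3 * X[:, 0] ** 2          # toy regression target

def fit_features(X, y, degree):
    """Least-squares fit on polynomial features (stand-in for a small net)."""
    Phi = np.hstack([X ** d for d in range(degree + 1)])
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return lambda Xn: np.hstack([Xn ** d for d in range(degree + 1)]) @ w

base = fit_features(X, y, degree=1)                   # small initial model
residual = y - base(X)                                # what the model misses

grown = fit_features(X, residual, degree=5)           # fit explainable error
combined = lambda Xn: base(Xn) + grown(Xn)            # grown model

mse_base = np.mean((y - base(X)) ** 2)
mse_grown = np.mean((y - combined(X)) ** 2)
print(mse_base, mse_grown)                            # growth reduces error
```

When the residual fit fails to reduce error, the residuals are effectively noise and growth can be skipped, which is what keeps the search cheap relative to architecture search or pruning.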
Regularizing Deep Neural Networks
Let's discuss regularizing deep neural networks. Deep neural networks with a large number of parameters are very powerful machine learning systems. However, overfitting can be a significant problem in such networks. Large networks are also slow to use, which makes it hard to address overfitting by combining the predictions of many different large neural networks at test time. Dropout is a technique for addressing this problem.
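A minimal sketch of dropout, in the "inverted" formulation used by most modern implementations (survivors are scaled at training time so no rescaling is needed at test time); the shapes and drop rate are illustrative:

```python
import numpy as np

# Inverted dropout: during training, each unit is zeroed with probability
# p and survivors are scaled by 1/(1 - p), so the expected activation
# matches test time, when the full network is used unscaled. This cheaply
# approximates averaging the predictions of many "thinned" networks.

def dropout(h, p, rng, training=True):
    if not training:
        return h                         # test time: all units, no scaling
    mask = rng.random(h.shape) >= p      # keep each unit with prob 1 - p
    return h * mask / (1.0 - p)

rng = np.random.default_rng(0)
h = np.ones((100000, 1))                 # toy activations
out = dropout(h, p=0.5, rng=rng)

print(out.mean())                        # close to 1.0: mean is preserved
```

Each training step thus samples a different thinned sub-network, which is what discourages units from co-adapting and reduces overfitting.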